🛠️ All DevTools

Showing 121–140 of 4274 tools

Last Updated
April 22, 2026 at 08:00 PM

From Product Hunt

Show HN: OQP – A verification protocol for AI agents

Show HN (score: 5)

[API/SDK] Show HN: OQP – A verification protocol for AI agents

As AI agents autonomously write and deploy code, there's no standard for verifying that what they shipped actually satisfies business requirements. OQP is an attempt to define that standard.

It's MCP-compatible and defines four core endpoints:

- GET /capabilities: what can this agent verify?
- GET /context/workflows: what are the business rules for this workflow?
- POST /verification/execute: run a verification workflow
- POST /verification/assess-risk: what is the risk of this change?

The analogy we keep coming back to: what OpenAPI did for REST APIs, OQP does for agentic software verification.

Early contributors include Philip Lew (XBOSoft) and Benjamin Young (W3C JSON-LD Working Group). Looking for feedback from engineers building on top of MCP, agent orchestration frameworks, or anyone who has felt the pain of "the agent shipped something wrong and we had no way to catch it."

Repo: github.com/OranproAi/open-qa-protocol
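As a rough illustration of the four endpoints named in the post, here is a minimal sketch of how a client might build requests for them. Only the HTTP methods and paths come from the post; the base URL `https://oqp.example`, the `build_request` helper, and the JSON payload fields are hypothetical placeholders, not part of the OQP specification.

```python
import json
from urllib.request import Request

# Methods and paths are taken from the post. The short names used as keys
# are hypothetical labels for this sketch only.
ENDPOINTS = {
    "capabilities": ("GET", "/capabilities"),
    "workflows": ("GET", "/context/workflows"),
    "execute": ("POST", "/verification/execute"),
    "assess_risk": ("POST", "/verification/assess-risk"),
}


def build_request(base_url, name, payload=None):
    """Build (but do not send) an HTTP request for one of the four endpoints."""
    method, path = ENDPOINTS[name]
    if method == "GET" and payload is not None:
        raise ValueError(f"{name} is a GET endpoint and takes no body")
    headers = {"Accept": "application/json"}
    data = None
    if method == "POST":
        # The payload shape is an assumption; the post does not describe it.
        data = json.dumps(payload or {}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(base_url.rstrip("/") + path, data=data,
                   headers=headers, method=method)
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be read and run without a server.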

Found: April 13, 2026 ID: 4144

N-Day-Bench – Can LLMs find real vulnerabilities in real codebases?

Hacker News (score: 24)

[Testing] N-Day-Bench – Can LLMs find real vulnerabilities in real codebases?

N-Day-Bench tests whether frontier LLMs can find known security vulnerabilities in real repository code. Each month it pulls fresh cases from GitHub security advisories, checks out the repo at the last commit before the patch, and gives models a sandboxed bash shell to explore the codebase.

Static vulnerability discovery benchmarks become outdated quickly. Cases leak into training data, and scores start measuring memorization. The monthly refresh keeps the test set ahead of contamination, or at least makes the contamination window honest.

Each case runs three agents: a Curator reads the advisory and builds an answer key, a Finder (the model under test) gets 24 shell steps to explore the code and write a structured report, and a Judge scores the blinded submission. The Finder never sees the patch. It starts from sink hints and must trace the bug through actual code.

Only repos with 10k+ stars qualify. A diversity pass prevents any single repo from dominating the set. Ambiguous advisories (merge commits, multi-repo references, unresolvable refs) are dropped.

Currently evaluating GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, GLM-5.1, and Kimi K2.5. All traces are public.

Methodology: https://ndaybench.winfunc.com/methodology

Live Leaderboard: https://ndaybench.winfunc.com/leaderboard

Live Traces: https://ndaybench.winfunc.com/traces
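The Finder's fixed budget of 24 shell steps can be sketched as a simple bounded loop. This is an illustration of the idea only, not the benchmark's actual harness: `run_finder`, the `policy` and `shell` callables, and the `("run", cmd)` / `("report", text)` action tuples are hypothetical, and the sketch assumes that writing the report does not itself consume a shell step.

```python
from dataclasses import dataclass, field
from typing import Optional

MAX_STEPS = 24  # step budget stated in the post


@dataclass
class FinderRun:
    transcript: list = field(default_factory=list)  # (command, output) pairs
    report: Optional[str] = None                    # None if the budget ran out


def run_finder(policy, shell, max_steps=MAX_STEPS):
    """Let a policy explore through `shell` until it reports or the budget ends.

    policy(transcript) returns ("run", command) or ("report", text).
    shell(command) returns the command's output as a string.
    """
    run = FinderRun()
    while True:
        kind, value = policy(run.transcript)
        if kind == "report":
            run.report = value
            return run
        if len(run.transcript) >= max_steps:
            # Budget exhausted before a report was written.
            return run
        run.transcript.append((value, shell(value)))
```

A run whose `report` is still `None` models a Finder that spent all of its steps without producing a submission for the Judge.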

Found: April 13, 2026 ID: 4140

GitHub Stacked PRs

Hacker News (score: 349)

[Other] GitHub Stacked PRs

Found: April 13, 2026 ID: 4138

Show HN: Lythonic – Compose Python functions into data-flow pipelines

Show HN (score: 5)

[Other] Show HN: Lythonic – Compose Python functions into data-flow pipelines

I had been thinking about something like this for years, with a few tries before this one. I started this repo last year and I think I have something usable now.

It is an async framework: mix sync/async Python functions, compose them into DAGs, run them, schedule them, and persist data between steps or let it flow just in memory.

GitHub: https://github.com/walnutgeek/lythonic

Docs: https://walnutgeek.github.io/lythonic/

PyPI: pip install lythonic

It is dataflow, so theoretically you can compose it with pure functions only. Lythonic requires annotations for params and returns to wire up outputs with inputs. All data is saved in SQLite as JSON for now, which should work fine for modest amounts of data.

You may use it as a task flow, keeping params and returns empty and maintaining all data outside of the flow.

But practically you may do well with some middle ground: flow just enough metadata through to make your function calls reproducible, and keep some system of record that you can query reliably.

Anyway I will stop rambling ... soon.

Python 3.11+. MIT License. Minimal dependencies: Pydantic, PyYAML, Croniter.

Prepping for v0.1. Looking for feedback. v0.0.14 is out. Claude generated reasonable docs. Sorry, I would not be able to do it better. I am working on a Web UI and a practical E2E example app as well.

Thank you. -Sergey
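To make the idea of wiring outputs to inputs through annotations concrete, here is a minimal sketch in plain Python. It does not use Lythonic's API (which the post does not show); `run_flow` and the example types are hypothetical, and the sketch assumes each return type is produced by exactly one function.

```python
from graphlib import TopologicalSorter
from typing import NewType, get_type_hints


def run_flow(funcs):
    """Run functions in dependency order, matching params to producers by type."""
    hints = {f: get_type_hints(f) for f in funcs}
    # Map each return type to the function that produces it.
    producer = {h["return"]: f for f, h in hints.items()}
    # A function depends on the producers of its parameter types.
    deps = {
        f: {producer[t] for name, t in h.items() if name != "return"}
        for f, h in hints.items()
    }
    results = {}
    for f in TopologicalSorter(deps).static_order():
        kwargs = {
            name: results[producer[t]]
            for name, t in hints[f].items()
            if name != "return"
        }
        results[f] = f(**kwargs)
    return {f.__name__: value for f, value in results.items()}


# Hypothetical example types and steps, for illustration only.
Raw = NewType("Raw", str)
Words = NewType("Words", list)
Count = NewType("Count", int)


def load() -> Raw:
    return Raw("compose python functions")


def split(text: Raw) -> Words:
    return Words(text.split())


def count(words: Words) -> Count:
    return Count(len(words))
```

Calling `run_flow([count, split, load])` runs `load`, then `split`, then `count` regardless of the order in which the functions are listed, because the order is derived from the annotations alone.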

Found: April 13, 2026 ID: 4160

GAIA – Open-source framework for building AI agents that run on local hardware

Hacker News (score: 126)

[Other] GAIA – Open-source framework for building AI agents that run on local hardware

Found: April 13, 2026 ID: 4143

Show HN: Ithihāsas – a character explorer for Hindu epics, built in a few hours

Hacker News (score: 126)

[Other] Show HN: Ithihāsas – a character explorer for Hindu epics, built in a few hours

Hi HN!

I’ve always found it hard to explore the Mahābhārata and Rāmāyaṇa online. Most content is either long-form or scattered, and understanding a character like Karna or Bhishma usually means opening multiple tabs.

I built https://www.ithihasas.in/ to solve that. It is a simple character explorer that lets you navigate the epics through people and their relationships instead of reading everything linearly.

This was also an experiment with Claude CLI. I was able to put together the first version in a couple of hours. It helped a lot with generating structured content and speeding up development, but UX and data consistency still needed manual work.

Would love feedback on the UX and whether this way of exploring mythology works for you.

Found: April 13, 2026 ID: 4141

How to make Firefox builds 17% faster

Hacker News (score: 29)

[Other] How to make Firefox builds 17% faster

Found: April 13, 2026 ID: 4135

A Python Interpreter Written in Python

Hacker News (score: 43)

[Other] A Python Interpreter Written in Python

Found: April 13, 2026 ID: 4197

Show HN: Mcptube – Karpathy's LLM Wiki idea applied to YouTube videos

Show HN (score: 10)

[Other] Show HN: Mcptube – Karpathy's LLM Wiki idea applied to YouTube videos

I watch a lot of Stanford/Berkeley lectures and YouTube content on AI agents, MCP, and security. I got tired of scrubbing through hour-long videos to find one explanation, so I built v1 of mcptube a few months ago. It performs transcript search and implements Q&A as an MCP server. It got traction (34 stars, my first open-source PR, some notable stargazers like the CEO of Trail of Bits).

But v1 re-searched raw chunks from scratch on every query. So I rebuilt it.

v2 (mcptube-vision) follows Karpathy's LLM Wiki pattern. At ingest time, it extracts transcripts, detects scene changes with ffmpeg, describes key frames via a vision model, and writes structured wiki pages. Knowledge compounds across videos rather than being re-discovered. Retrieval uses FTS5 plus a two-stage agent (narrow, then reason).

MCPTube works both as a CLI (BYOK) and as an MCP server. I tested MCPTube with Claude Code, Claude Desktop, VS Code Copilot, Cursor, and others. No API key is needed server-side.

Coming soon: I am also building a SaaS platform that supports playlist ingestion, team wikis, etc. Early access signup: https://0xchamin.github.io/mcptube/

Happy to discuss architecture tradeoffs: FTS5 vs vectors, file-based wiki vs DB, scene-change vs fixed-interval sampling. Give it a try via `pip install mcptube`. Also, please do star the repo if you enjoy my contribution (https://github.com/0xchamin/mcptube).

Found: April 13, 2026 ID: 4147

Show HN: Dbg – One CLI debugger for every language (AI-agent ready)

Show HN (score: 5)

[CLI Tool] Show HN: Dbg – One CLI debugger for every language (AI-agent ready)

AI agents are great at writing code but blind at runtime. They guess, print, and waste tokens.

I built dbg to give them a real debugger experience. It is backend-based, so with the few backends I have implemented (still at a basic level) it can support 15+ languages with one simple CLI (some work is still needed, but it is functional as is):

LLDB, Delve, PDB, JDB, node inspect, rdbg, phpdbg, GHCi, etc. Profilers too (perf, pprof, cProfile, Valgrind…)

I also added GPU profiling via `gdbg` (CUDA, PyTorch, Triton kernels). It auto-dispatches and shares the same unified interface. (Planning to bring those advanced concepts back to the main dbg.)

Works with Claude & Codex (probably works with others, but I didn't try them).

Quick start:

```
curl -sSf https://raw.githubusercontent.com/redknightlois/dbg/main/install.sh | sh
dbg --init claude   # for Claude
```

Then just say: “use dbg to debug the crash in src/foo.rs”

Docs: https://redknightlois.github.io/dbg/

GitHub (MIT Licensed): https://github.com/redknightlois/dbg

Would love feedback from anyone building agents. What languages or features are you missing most? Ping me at @federicolois on X or open issues.

Found: April 13, 2026 ID: 4151

Building a CLI for All of Cloudflare

Hacker News (score: 182)

[CLI Tool] Building a CLI for All of Cloudflare

Found: April 13, 2026 ID: 4136

Initial mainline video capture and camera support for Rockchip RK3588

Hacker News (score: 19)

[Other] Initial mainline video capture and camera support for Rockchip RK3588

Found: April 13, 2026 ID: 4132

Michigan 'digital age' bills pulled after privacy concerns raised

Hacker News (score: 180)

[Other] Michigan 'digital age' bills pulled after privacy concerns raised

Found: April 13, 2026 ID: 4137

Show HN: I built a social media management tool in 3 weeks with Claude and Codex

Hacker News (score: 148)

[Other] Show HN: I built a social media management tool in 3 weeks with Claude and Codex

Found: April 13, 2026 ID: 4133

Show HN: Equirect – a Rust VR video player

Show HN (score: 7)

[Other] Show HN: Equirect – a Rust VR video player

This is almost entirely created by Claude, not me. I know some people aren't into that. I was one of them 3 months ago. Since the beginning of the year I finally started getting more serious about trying out AI. The company I work for also had an AI week with lots of training. All I can say is I'm pretty blown away. My entire life feels like it changed over the last month, from someone who mostly writes code to someone who mostly prompts AI to write code. And just for a tiny bit of context, I'm 60 years old and have been coding since 1980.

I get all the concerns, and I review all AI code at work and most AI code for personal projects. This one in particular though, not so much. I get that's frowned on, but this is a small, limited-scope, personal project. Not that I didn't pay attention: Claude did do some things in strange ways and I asked it to fix them quite often. But, conversely, I have zero Rust experience, zero OpenXR experience, zero wgpu experience, and next to zero relevant Windows experience.

I'm guessing I spent about 30 hours in total prompting Claude for each step. I started with "make a windows app that opens a window". Then I had it add wgpu and draw hello triangles. Then I had it add OpenXR and draw those triangles in VR. That actually took it some time as it tried to figure out how to connect a wgpu texture to the surface being drawn in OpenXR. It figured it out though, far far faster than I would have. I'd have tried to find a working example or given up.

I then sat on that for about a month and finally got back to it this weekend and zoomed through getting Claude to make it work. The only part I did was make some programmer art icons.

I can post the prompts in the repo if anyone is interested, assuming I can find them.

Also in the last 2 weeks I've resurrected an old project that had bit-rotted. Claude got it all up to date, fixed a bunch of bugs, and checked off a bunch of features I'd always wanted to add. I also had Claude write 2 libraries, a zip library and a RAR decompression library, as well as refactor an existing zip decompression library to use some modern features. It's been really fun! For those I read the code much more than I did for this one. Still, "what a time to be alive"!

Found: April 13, 2026 ID: 4134

A Git helper tool that breaks large merges into parallelizable tasks

Hacker News (score: 26)

[Other] A Git helper tool that breaks large merges into parallelizable tasks

Found: April 13, 2026 ID: 4198

Show HN: Rekal – Long-term memory for LLMs in a single SQLite file

Show HN (score: 7)

[Database] Show HN: Rekal – Long-term memory for LLMs in a single SQLite file I got tired of repeating myself to my LLM every session. rekal is an MCP server that stores memories in SQLite and retrieves them with hybrid search (BM25 + vectors + recency decay). One file, local embeddings, no API keys.
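One way to picture hybrid search that blends BM25, vector similarity, and recency decay is a weighted sum of three normalized signals. The sketch below is an assumption about how such a score could be combined, not rekal's actual formula: the weights, the 30-day half-life, and the function names are all hypothetical, and it assumes a non-negative BM25 score where higher means a better match.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recency(age_days, half_life_days=30.0):
    """Exponential decay: 1.0 for a brand-new memory, 0.5 after one half-life."""
    return 0.5 ** (age_days / half_life_days)


def hybrid_score(bm25, max_bm25, query_vec, memory_vec, age_days,
                 weights=(0.4, 0.4, 0.2)):
    """Blend lexical, semantic, and recency signals into one score in [0, 1]."""
    # Normalize BM25 against the best score in the candidate set.
    lexical = bm25 / max_bm25 if max_bm25 > 0 else 0.0
    # Clamp negative cosine values so the score stays non-negative.
    semantic = max(0.0, cosine(query_vec, memory_vec))
    w_lex, w_sem, w_rec = weights
    return w_lex * lexical + w_sem * semantic + w_rec * recency(age_days)
```

With identical lexical and semantic signals, a newer memory scores higher than an older one, which is the behavior the recency term is meant to provide.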

Found: April 12, 2026 ID: 4148

I ran Gemma 4 as a local model in Codex CLI

Hacker News (score: 18)

[CLI Tool] I ran Gemma 4 as a local model in Codex CLI

Found: April 12, 2026 ID: 4131

Show HN: T4 – a versioned datastore with branching and time-travel (S3-backed)

Show HN (score: 6)

[Database] Show HN: T4 – a versioned datastore with branching and time-travel (S3-backed)

Hi HN,

I built t4, a datastore that stores its WAL and snapshots in S3.

Instead of traditional storage, it writes append-only segments to object storage and reconstructs state from checkpoints + WAL.

A side effect of this model is that the database becomes naturally versioned: you can restore any past state, branch from any point (with copy-on-write), and replay history.

I started this as an experiment to replace etcd in Kubernetes, but it’s evolving into a general-purpose versioned state store.

I'm curious what people think about it and would appreciate any feedback!
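The checkpoint-plus-WAL model described above can be sketched with an in-memory toy. This is not t4's implementation or API: `VersionedStore` and its methods are hypothetical, there is no S3 involved, and the branch copies the log prefix for simplicity where a real copy-on-write design would share the underlying segments.

```python
class VersionedStore:
    """Toy key-value store: append-only log plus periodic snapshots."""

    def __init__(self, checkpoint_every=3):
        self.checkpoint_every = checkpoint_every
        self.log = []                 # append-only list of (key, value) writes
        self.checkpoints = {0: {}}    # version -> snapshot of state at that version

    def put(self, key, value):
        """Append a write and return the new version number."""
        self.log.append((key, value))
        version = len(self.log)
        if version % self.checkpoint_every == 0:
            self.checkpoints[version] = self.state_at(version)
        return version

    def state_at(self, version):
        """Rebuild state at any version: nearest checkpoint, then replay the log."""
        if not 0 <= version <= len(self.log):
            raise IndexError(f"unknown version {version}")
        base = max(v for v in self.checkpoints if v <= version)
        state = dict(self.checkpoints[base])
        for key, value in self.log[base:version]:
            state[key] = value
        return state

    def branch(self, version):
        """Start an independent history from a past version."""
        if not 0 <= version <= len(self.log):
            raise IndexError(f"unknown version {version}")
        child = VersionedStore(self.checkpoint_every)
        child.log = self.log[:version]
        # Snapshots are never mutated after creation, so sharing them is safe.
        child.checkpoints = {v: s for v, s in self.checkpoints.items()
                             if v <= version}
        return child
```

Because every version is just a position in the log, restoring a past state and branching from it are the same operation up to where the next write goes.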

Found: April 12, 2026 ID: 4130

Show HN: Claudraband – Claude Code for the Power User

Hacker News (score: 34)

[CLI Tool] Show HN: Claudraband – Claude Code for the Power User

Hello everyone.

Claudraband wraps a Claude Code TUI in a controlled terminal to enable extended workflows. It uses tmux for visible controlled sessions or xterm.js for headless sessions (a little slower), but everything is mediated by an actual Claude Code TUI.

One example of a workflow I use now is having my current Claude Code interrogate older sessions for certain decisions it made: https://github.com/halfwhey/claudraband?tab=readme-ov-file#self-interrogation

This project provides:

- Resumable non-interactive workflows. Essentially `claude -p` with session support: `cband continue <session-id> 'what was the result of the research?'`
- HTTP server to remotely control a Claude Code session: `cband serve --port 8123`
- ACP server to use with alternative frontends such as Zed or Toad (https://github.com/batrachianai/toad): `cband acp --model haiku`
- TypeScript library so you can integrate these workflows into your own application.

This exists because I was using `tmux send-keys` heavily in a lot of my Claude Code workflows, and I wanted to streamline it.

Found: April 12, 2026 ID: 4128
